AITopics | exponential convergence

exponential convergence

Wasserstein Contraction of Coordinate Ascent Variational Inference

Caprio, Rocco, Corenflos, Adrien, Power, Sam

arXiv.org Machine Learning | May-29-2026

Finding approximations to an intractable probability distribution $\pi$ of interest (usually known only up to a normalizing constant) is a key problem in scientific computing. Variational Inference stands out as a particularly attractive tool for this task, owing to its statistical and computational efficiency, and it has been the framework underlying many advances in computational statistics over the past half century (Parisi, 1980; Hinton and Van Camp, 1993; Jordan et al., 1999; Bishop and Nasrabadi, 2006). The central idea is to seek a tractable approximation to $\pi$ within a chosen family of tractable distributions $\mathcal{Q}$ by minimizing a divergence to $\pi$ over that 'variational' family. Often, it is convenient or well-motivated to work with the family of product (or tensor, or factorized) distributions $\mathcal{Q} = \mathcal{P}^{\otimes m}$, and define optimality through minimisation of the Kullback-Leibler (KL) divergence (also 'relative entropy'): $\min \{ \mathrm{KL}(\varrho \,\|\, \pi) : \varrho \in \mathcal{P}^{\otimes m} \}$. A key practical aspect of working with this particular loss function is that in solving the associated optimisation problem, one is only required to compute expectations under the tractable variational distribution $\varrho$, rather than under the intractable target distribution $\pi$. In Bayesian statistics, $\pi$ typically represents the joint posterior distribution of latent variables $z \in \mathcal{Z}$ and some parameters $\beta \in \mathcal{B}$ given observed data $y \in \mathcal{Y}$. In these cases, we often choose $m = 2$ and seek the best variational approximation $\mu(dz) \otimes \nu(d\beta)$ to $\pi$ to solve $\min \{ \mathrm{KL}(\mu \otimes \nu \,\|\, \pi) : \mu \in \mathcal{P}(\mathcal{Z}),\ \nu \in \mathcal{P}(\mathcal{B}) \}$. The coordinate ascent variational inference algorithm (CAVI, Bishop and Nasrabadi, 2006; Blei et al., 2017) solves this problem by iteratively minimizing the Kullback-Leibler divergence with respect to one element at a time: given a starting point $\nu_0$, it iterates $\mu_k := \operatorname{argmin}$
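The alternating updates described in this abstract can be illustrated on a toy case. The sketch below is not taken from the paper: it assumes a two-dimensional Gaussian target with known mean and precision matrix, which is a standard textbook setting where both coordinate updates are available in closed form. The function name `cavi_gaussian` and its parameters are hypothetical.

```python
def cavi_gaussian(mu, prec, m2_init, n_iters=50):
    """Mean-field CAVI for a 2-D Gaussian target N(mu, prec^{-1}).

    Assumption (not from the abstract): with a Gaussian target, each
    optimal factor is Gaussian with fixed variance 1/prec[i][i], so only
    the factor means m1 and m2 need to be iterated.
    """
    (l11, l12), (l21, l22) = prec
    m2 = m2_init          # plays the role of the starting point nu_0
    history = []
    for _ in range(n_iters):
        # Update the first factor with the second held fixed.
        m1 = mu[0] - (l12 / l11) * (m2 - mu[1])
        # Update the second factor with the new first factor held fixed.
        m2 = mu[1] - (l21 / l22) * (m1 - mu[0])
        history.append((m1, m2))
    return history


if __name__ == "__main__":
    hist = cavi_gaussian(mu=(1.0, -2.0),
                         prec=((2.0, 1.0), (1.0, 2.0)),
                         m2_init=5.0)
    for k, (m1, m2) in enumerate(hist[:5], start=1):
        print(k, m1, m2)
```

In this toy case the error in each mean shrinks by the fixed factor `l12 * l21 / (l11 * l22)` per sweep (0.25 for the precision matrix above), which is a simple instance of geometric, i.e. exponential, convergence.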

artificial intelligence, inequality, machine learning, (14 more...)

arXiv.org Machine Learning

2605.30253

Country: Asia > Middle East > Jordan (0.24)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

8c96b559340daa7bb29f56ccfbbc9c2f-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-15-2026, 19:19:55 GMT

artificial intelligence, machine learning, operation, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.95)

Robust Canonicalization through Bootstrapped Data Re-Alignment

Schmidt, Johann, Stober, Sebastian

arXiv.org Artificial Intelligence | Oct-10-2025

Fine-grained visual classification (FGVC) tasks, such as insect and bird identification, demand sensitivity to subtle visual cues while remaining robust to spatial transformations. A key challenge is handling geometric biases and noise, such as different orientations and scales of objects. Existing remedies rely on heavy data augmentation, which demands powerful models, or on equivariant architectures, which constrain expressivity and add cost. Canonicalization offers an alternative by shielding such biases from the downstream model. In practice, such functions are often obtained using canonicalization priors, which assume aligned training data. Unfortunately, real-world datasets never fulfill this assumption, causing the obtained canonicalizer to be brittle. We propose a bootstrapping algorithm that iteratively re-aligns training samples by progressively reducing variance and recovering the alignment assumption. We establish convergence guarantees under mild conditions for arbitrary compact groups, and show on four FGVC benchmarks that our method consistently outperforms equivariant and canonicalization baselines while performing on par with augmentation.

artificial intelligence, international conference, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2510.08178

Country: Europe > Germany (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

8c96b559340daa7bb29f56ccfbbc9c2f-Supplemental-Conference.pdf

Neural Information Processing Systems | Oct-9-2025, 01:03:21 GMT

artificial intelligence, machine learning, operation, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.95)

A Dimension-Decomposed Learning Framework for Online Disturbance Identification in Quadrotor SE(3) Control

Gao, Tianhua

arXiv.org Artificial Intelligence | Oct-6-2025

Quadrotor stability under complex dynamic disturbances and model uncertainties poses significant challenges. One of these is the underfitting problem in high-dimensional features, which limits the identification capability of current learning-based methods. To address this, we introduce a new perspective: Dimension-Decomposed Learning (DiD-L), from which we develop the Sliced Adaptive-Neuro Mapping (SANM) approach for geometric control. Specifically, the high-dimensional mapping for identification is axially "sliced" into multiple low-dimensional submappings ("slices"). In this way, the complex high-dimensional problem is decomposed into a set of simple low-dimensional tasks addressed by shallow neural networks and adaptive laws. These neural networks and adaptive laws are updated online via Lyapunov-based adaptation without any pre-training or persistent excitation (PE) condition. To enhance the interpretability of the proposed approach, we prove that the full-state closed-loop system exhibits arbitrarily-close-to-exponential stability despite multi-dimensional time-varying disturbances and model uncertainties. This result is novel as it demonstrates exponential convergence without requiring pre-training for unknown disturbances or specific knowledge of the model.

artificial intelligence, machine learning, neural network, (17 more...)

arXiv.org Artificial Intelligence

2510.031

Genre: Research Report (0.50)

Industry: Energy > Renewable (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Dimension-Decomposed Learning for Quadrotor Geometric Attitude Control with Almost Global Exponential Convergence on SO(3)

Gao, Tianhua, Izumita, Masashi, Tomita, Kohji, Kamimura, Akiya

arXiv.org Artificial Intelligence | Aug-29-2025

This paper introduces a lightweight and interpretable online learning approach called Dimension-Decomposed Learning (DiD-L) for disturbance identification in quadrotor geometric attitude control. As a module instance of DiD-L, we propose the Sliced Adaptive-Neuro Mapping (SANM). Specifically, to address underlying underfitting problems, the high-dimensional mapping for online identification is axially "sliced" into multiple low-dimensional submappings (slices). In this way, the complex high-dimensional problem is decomposed into a set of simple low-dimensional subtasks addressed by shallow neural networks and adaptive laws. These neural networks and adaptive laws are updated online via Lyapunov-based adaptation without the persistent excitation (PE) condition. To enhance the interpretability of the proposed approach, we prove that the state solution of the rotational error dynamics exponentially converges into an arbitrarily small ball within an almost global attraction domain, despite time-varying disturbances and inertia uncertainties. This result is novel as it demonstrates exponential convergence without requiring pre-training for unseen disturbances or specific knowledge of the model. To our knowledge, DiD-L is the first online learning approach in the quadrotor control field that is lightweight enough to run in real time at 400 Hz on microcontroller units (MCUs) such as STM32, and it has been validated through real-world experiments.

artificial intelligence, machine learning, neural network, (18 more...)

arXiv.org Artificial Intelligence

2508.14422

Genre: Research Report (0.81)

Industry:

Education > Educational Setting (0.68)
Information Technology (0.67)
Transportation (0.46)
Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

NeurIPS_2022_Kernel_Attention (9)

Alexander Moreno

Neural Information Processing Systems | Aug-15-2025, 20:54:07 GMT

Lacking closed form expressions for the context vector, we use numerical integration: we prove exponential convergence for both families.

exponential family, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Israel (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)

Genre: Research Report (0.46)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Data Science (0.93)

From Text to Trajectories: GPT-2 as an ODE Solver via In-Context

Ma, Ziyang, Zhou, Baojian, Yang, Deqing, Xiao, Yanghua

arXiv.org Artificial Intelligence | Aug-6-2025

In-Context Learning (ICL) has emerged as a new paradigm in large language models (LLMs), enabling them to perform novel tasks by conditioning on a few examples embedded in the prompt. Yet, the highly nonlinear behavior of ICL for NLP tasks remains poorly understood. To shed light on its underlying mechanisms, this paper investigates whether LLMs can solve ordinary differential equations (ODEs) under the ICL setting. We formulate standard ODE problems and their solutions as sequential prompts and evaluate GPT-2 models on these tasks. Experiments on two types of ODEs show that GPT-2 can effectively learn a meta-ODE algorithm, with convergence behavior comparable to, or better than, the Euler method, and achieve exponential accuracy gains with increasing numbers of demonstrations. Moreover, the model generalizes to out-of-distribution (OOD) problems, demonstrating robust extrapolation capabilities. These empirical findings provide new insights into the mechanisms of ICL in NLP and its potential for solving nonlinear numerical problems.
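For readers unfamiliar with the baseline named in this abstract, here is a minimal sketch of the forward Euler method. The test problem y' = -y with y(0) = 1 and the function name `euler` are illustrative assumptions, not the ODEs or code used in the paper.

```python
import math


def euler(f, t0, y0, t1, n_steps):
    """Forward Euler: advance y' = f(t, y) from t0 to t1 in n_steps steps."""
    h = (t1 - t0) / n_steps
    t, y = t0, y0
    for _ in range(n_steps):
        y += h * f(t, y)   # follow the slope at the current point
        t += h
    return y


def decay(t, y):
    """Illustrative test problem (an assumption): y' = -y, exact y(t) = exp(-t)."""
    return -y


if __name__ == "__main__":
    exact = math.exp(-1.0)
    for n in (100, 200, 400):
        err = abs(euler(decay, 0.0, 1.0, 1.0, n) - exact)
        print(n, err)
```

Doubling the number of steps roughly halves the error on this problem, which is the first-order convergence that any comparison against the Euler method is measured against.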

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2508.03031

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Policy Iteration for Stochastic Optimal Control: A Physics-Informed Approach

Kim, Yeongjong, Kim, Yeoneung, Kim, Minseok, Cho, Namkyeong

arXiv.org Artificial Intelligence | Aug-5-2025

We propose a physics-informed neural network policy iteration (PINN-PI) framework for solving stochastic optimal control problems governed by second-order Hamilton-Jacobi-Bellman (HJB) equations. At each iteration, a neural network is trained to approximate the value function by minimizing the residual of a linear PDE induced by a fixed policy. This linear structure enables systematic $L^2$ error control at each policy evaluation step, and allows us to derive explicit Lipschitz-type bounds that quantify how value gradient errors propagate to the policy updates. This interpretability provides a theoretical basis for evaluating policy quality during training. Our method extends recent deterministic PINN-based approaches to stochastic settings, inheriting the global exponential convergence guarantees of classical policy iteration under mild conditions. We demonstrate the effectiveness of our method on several benchmark problems, including stochastic cartpole, pendulum problems and high-dimensional linear quadratic regulation (LQR) problems in up to 10D.
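The evaluate-then-improve loop this abstract builds on can be sketched in the simplest classical setting. The example below is not the PINN-PI method: it assumes a deterministic scalar continuous-time LQR problem (dynamics x' = a x + b u, running cost q x^2 + r u^2, linear policy u = -k x), where policy evaluation reduces to a scalar linear equation instead of a PDE solved by a neural network. All names are hypothetical.

```python
import math


def policy_iteration_lqr(a, b, q, r, k0, n_iters=10):
    """Classical policy iteration for a scalar continuous-time LQR problem.

    Assumed setting (not from the abstract): x' = a*x + b*u, cost
    integral of q*x^2 + r*u^2, policy u = -k*x, value V(x) = p*x^2.
    """
    if a - b * k0 >= 0:
        raise ValueError("initial gain must be stabilizing (a - b*k0 < 0)")
    k = k0
    values = []
    for _ in range(n_iters):
        # Policy evaluation: solve the linear equation
        # 2*(a - b*k)*p + q + r*k^2 = 0 for the value coefficient p.
        p = (q + r * k * k) / (2.0 * (b * k - a))
        # Policy improvement: minimize r*u^2 + 2*p*x*b*u over u.
        k = b * p / r
        values.append(p)
    return k, values


def riccati_solution(a, b, q, r):
    """Stabilizing root of the scalar Riccati equation 2*a*p - b^2*p^2/r + q = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


if __name__ == "__main__":
    k, values = policy_iteration_lqr(a=1.0, b=1.0, q=1.0, r=1.0, k0=2.0)
    print(values[:4], riccati_solution(1.0, 1.0, 1.0, 1.0))
```

Each evaluation step here is linear in the unknown `p` for a fixed gain, mirroring the linear structure that the abstract exploits at the PDE level.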

artificial intelligence, iteration, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2508.01718

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Some remarks on gradient dominance and LQR policy optimization

Sontag, Eduardo D.

arXiv.org Artificial Intelligence | Jul-17-2025

Solutions of optimization problems, including policy optimization in reinforcement learning, typically rely upon some variant of gradient descent. There has been much recent work in the machine learning, control, and optimization communities applying the Polyak-Łojasiewicz Inequality (PLI) to such problems in order to establish an exponential rate of convergence (a.k.a. "linear convergence" in the local-iteration language of numerical analysis) of loss functions to their minima under the gradient flow. Often, as is the case of policy iteration for the continuous-time LQR problem, this rate vanishes for large initial conditions, resulting in a mixed globally linear / locally exponential behavior. This is in sharp contrast with the discrete-time LQR problem, where there is global exponential convergence. That gap between CT and DT behaviors motivates the search for various generalized PLI-like conditions, and this talk will address that topic. Moreover, these generalizations are key to understanding the transient and asymptotic effects of errors in the estimation of the gradient, errors which might arise from adversarial attacks, wrong evaluation by an oracle, early stopping of a simulation, inaccurate and very approximate digital twins, stochastic computations (algorithm "reproducibility"), or learning by sampling from limited data. We describe an "input to state stability" (ISS) analysis of this issue. The second part discusses convergence and PLI-like properties of "linear feedforward neural networks" in feedback control. Much of the work described here was done in collaboration with Arthur Castello B. de Oliveira, Leilei Cui, Zhong-Ping Jiang, and Milad Siami.
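The link between the PLI and an exponential rate under the gradient flow, mentioned in this abstract, follows from a short standard argument. The notation below (loss $f$ with minimum value $f^\star$, PLI constant $\mu > 0$) is an assumption introduced for illustration and is not taken from the abstract.

```latex
\begin{align*}
\tfrac{1}{2}\,\|\nabla f(x)\|^2 &\ge \mu\,\bigl(f(x) - f^\star\bigr)
  && \text{(PLI with constant } \mu > 0\text{)} \\
\dot{x}(t) &= -\nabla f(x(t))
  && \text{(gradient flow)} \\
\frac{d}{dt}\bigl(f(x(t)) - f^\star\bigr) &= -\|\nabla f(x(t))\|^2
  \le -2\mu\,\bigl(f(x(t)) - f^\star\bigr)
  && \text{(chain rule, then PLI)} \\
f(x(t)) - f^\star &\le e^{-2\mu t}\,\bigl(f(x(0)) - f^\star\bigr)
  && \text{(Gr\"onwall's inequality)}
\end{align*}
```

When the inequality only holds with a constant that degrades for large initial conditions, the last step no longer gives a uniform exponential rate, which is the situation the abstract describes for the continuous-time LQR problem.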

artificial intelligence, convergence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2507.10452

Country: North America > United States (0.68)

Genre: Research Report (0.50)

Industry: Government > Military (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)
